Abstract

Depending on the type of activity and on its priorities, risk can take on different meanings and
can be assessed in different ways.

In general, risk is measured as a combination of the probability of an event (frequency) and
its consequence (impact). To estimate the frequency and the impact (severity), historical data or
expert opinions (either qualitative or quantitative data) are used. Moreover, qualitative data must
be converted into numerical values to be used in the model.

In the case of enterprise risk assessment the risks considered are, for instance, strategic,
operational, legal and image risks, which are often difficult to quantify. So in most cases only
expert data, gathered by scorecard approaches, are available for risk analysis.

A Bayesian Network is a useful tool for integrating different sources of information and, in
particular, for studying the joint distribution of the risks by using data collected from experts.

In this paper we show a possible approach for building a Bayesian network in the
particular case in which only the prior probabilities of the node states and the marginal correlations
between nodes are available, and the variables have only two states.
INTRODUCTION

A Bayesian Net (BN) is a directed acyclic graph (a probabilistic expert system) in which
every node represents a random variable with discrete or continuous states.

The relationships among variables, indicated by arcs, are interpreted in terms of
conditional probabilities according to Bayes' theorem.

The BN implements the concept of conditional independence, which, through the Markov
property, allows the joint probability to be factorized into a series of local terms
that describe the relationships among variables:

$$P(x_1, \ldots, x_n) = \prod_{j=1}^{n} P\big(x_j \mid pa(x_j)\big)$$

where $pa(x_j)$ denotes the states of the predecessors (parents) of the variable $X_j$ (child).
This factorization enables us to study the network locally.
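As a minimal sketch of this factorization, the fragment below builds the joint distribution of a small hypothetical network A -> B, A -> C from its local terms. The network shape and every probability value are assumptions chosen for illustration only; they are not taken from the database discussed in this paper.

```python
from itertools import product

# Hypothetical network A -> B and A -> C with binary states (1 = Y, 0 = N).
# All probabilities are illustrative assumptions, not values from the paper.
P_A = {1: 0.85, 0: 0.15}
P_B_GIVEN_A = {1: {1: 0.40, 0: 0.60}, 0: {1: 0.10, 0: 0.90}}  # indexed [a][b]
P_C_GIVEN_A = {1: {1: 0.60, 0: 0.40}, 0: {1: 0.25, 0: 0.75}}  # indexed [a][c]


def joint(a, b, c):
    """P(A=a, B=b, C=c) as the product of the local terms P(x_j | pa(x_j))."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_A[a][c]


# The local terms define a proper joint distribution: it sums to one.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))

# A marginal is recovered by summing the joint over the other variables.
p_b_yes = sum(joint(a, 1, c) for a, c in product((0, 1), repeat=2))
```

Only five free numbers are specified here (one for A, two each for B and C), instead of the seven needed for an unrestricted joint over three binary variables.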

A Bayesian Network requires an appropriate database from which to extract the conditional
probabilities (parameter learning problem) and the network structure (structural learning problem).

The objective is to find the net that best approximates the joint probabilities and the 
dependencies among variables. 

Once the network has been constructed, one of its most common uses is probabilistic
inference, that is, estimating the state probabilities of some nodes given knowledge
of the values of other nodes. Inference can be carried out from children to parents (this is
called diagnosis) or, vice versa, from parents to children (this is called prediction).
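The two directions of inference can be illustrated on the smallest possible network, a parent F with one child C. The sketch below assumes hypothetical values for P(F=Y) and for the two CPT entries; the function names are ours and do not come from any library.

```python
def predict_child(a1, a2, parent_yes):
    """Prediction: P(C=Y | F) is read directly from the child's CPT.

    a1 = P(C=Y | F=Y), a2 = P(C=Y | F=N).
    """
    return a1 if parent_yes else a2


def diagnose_parent(x, a1, a2):
    """Diagnosis: P(F=Y | C=Y) obtained with Bayes' theorem, where x = P(F=Y)."""
    y = a1 * x + a2 * (1.0 - x)  # marginal P(C=Y)
    return a1 * x / y


# Hypothetical numbers: P(F=Y) = 0.2, P(C=Y|F=Y) = 0.9, P(C=Y|F=N) = 0.3.
posterior = diagnose_parent(0.2, 0.9, 0.3)
```

With these assumed numbers, observing the child raises the probability of the parent from 0.2 to 3/7.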

However, in many cases data are not available because the events examined can be
new, rare, complex or little understood. In such conditions experts' opinions are used to
collect information that will be translated into conditional probability values or into a certain
joint or prior distribution (probability elicitation).

Such problems are more evident when the expert is asked to define
too many conditional probabilities because of the number of parents of a variable. So, when
possible, it is worthwhile to reduce the number of probabilities to be specified by assuming
some relationships that impose constraints on the interactions between parents and children, as
for example the noisy-OR model and its variations and generalizations.
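To make the saving concrete, the sketch below builds a full CPT from the standard noisy-OR rule, in which each active parent independently fails to trigger the child with probability 1 - p_i. The link probabilities and the optional leak term are hypothetical inputs; this is a generic illustration of noisy-OR, not the method developed in this paper.

```python
from itertools import product


def noisy_or_cpt(link_probs, leak=0.0):
    """Return {parent_states: P(child=Y | parent_states)} under noisy-OR.

    link_probs[i] is the (assumed) probability that parent i alone causes
    the child; leak is the probability that the child occurs with no
    active parent.
    """
    cpt = {}
    for states in product((0, 1), repeat=len(link_probs)):
        q = 1.0 - leak  # probability that nothing triggers the child
        for active, p in zip(states, link_probs):
            if active:
                q *= 1.0 - p
        cpt[states] = 1.0 - q
    return cpt


# Two parents: 2 numbers are elicited instead of the 4 rows of the full CPT.
cpt = noisy_or_cpt([0.8, 0.5])
```

With n parents the expert supplies n link probabilities, while the unrestricted CPT has 2^n rows.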
In the business field, Bayesian Nets are a useful tool for a multivariate and integrated
analysis of risks, for their monitoring and for the evaluation of intervention strategies
(by decision graphs) for their mitigation.



Enterprise risk can be defined as the possibility that something with an impact on the
objectives happens, and it is measured as a combination of the probability of an event
(frequency) and of its consequence (impact).

Enterprise risk assessment is a part of Enterprise Risk Management (ERM), where
historical data as well as expert opinions are typically used to estimate the frequency
and the impact distributions. Such distributions are then combined to obtain the loss
distribution.

In this context Bayesian Nets are a useful tool to integrate historical data with data
coming from experts, which can be qualitative or quantitative.



OUR PROPOSAL 



What we present in this work is the construction of a Bayesian Net that gives an
integrated view of the risks involved in the building of an important structure in Italy, where the
risk frequencies and impacts were collected by an ERM procedure using expert opinions.

We have constructed the network by using an already existing database (DB) in which the
available information consists of the risks with their frequencies, their impacts and the correlations
among them. In total there are about 300 risks.

In our work we have considered only the frequencies of the risks and not the impacts. With our
BN we construct the joint probability of the risks; the impacts could be used in a later phase
of scenario analysis to evaluate the loss distribution under the different scenarios.

Table 1 shows the DB structure used for network learning, in which each risk is
considered as a binary variable (one if the risk exists (yes) and zero if the risk does not exist
(no)). Therefore, for each risk considered in the network there will be one node with two
states (one = Y and zero = N).



TABLE I: Expert values database structure (Learning table)

PARENT   CHILD    CORRELATION          PARENT FREQ.             CHILD FREQ.
RISK A   RISK B   $\rho_{AB}$ = 0.5    P(risk A = Yes) = 0.85   P(risk B = Yes) = 0.35
RISK A   RISK C   $\rho_{AC}$ = 0.3    P(risk A = Yes) = 0.85   P(risk C = Yes) = 0.55



The task is, therefore, to find the conditional probability tables (CPTs) by using only the
correlations and the marginal frequencies. The net structure, instead, is obtained from table 1
by following the node relationships given by the correlations.
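One possible in-memory representation of the learning table, and of the way arcs follow from its rows, is sketched below. The class name `LearningRow` and the field names are hypothetical; the two rows reproduce the example values of table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningRow:
    """One row of the learning table (field names are our own choice)."""
    parent: str
    child: str
    correlation: float
    parent_freq: float  # P(parent = Yes)
    child_freq: float   # P(child = Yes)


rows = [
    LearningRow("RISK A", "RISK B", 0.5, 0.85, 0.35),
    LearningRow("RISK A", "RISK C", 0.3, 0.85, 0.55),
]

# Each row contributes one arc parent -> child; grouping by child gives the
# parent set for which a CPT has to be computed.
parents_of = {}
for row in rows:
    parents_of.setdefault(row.child, []).append(row.parent)
```

The number of entries in `parents_of[child]` then decides which equation system (one, two or three parents) applies to that child.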

The main ideas behind the construction of the BN have been: first, to express the joint
probabilities as functions of only the correlations and the marginal probabilities; second, to
understand how the correlations are linked with the incremental ratios, or the derivatives,
of the child's probabilities as functions of the parent's probabilities. This choice is due to
the fact that parent and child interact through the values of the conditional probabilities; the
derivatives are directly linked to such probabilities and, therefore, to the degree of
interaction between the two nodes and hence to the correlation.

We then worked out how to set up the equations; for the case with dependent
parents we used the local network topology to set them up.

We have been able to calculate the CPT for up to three parents for each child. Although
it is possible to generalize to more than three parents, this requires more
data, which are not available in our DB. So when four or more parents are present we have
decided to divide them and reduce the problem to cases with no more than three parents. To approximate
the network we have "separated" the nodes that have the same effects on the child (for
example, the same correlations) by using auxiliary nodes [3]. When more than
one scheme was possible we used the mutual information (MI) criterion as a
discriminating index, selecting the approximation with the highest total MI; this is equivalent
to choosing the structure with the minimum distance between the network and the target
distribution.
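The MI criterion can be sketched as follows. The first function computes the mutual information between a parent and a child from their 2x2 joint table; the second sums it over arcs, which is our assumption about how the "total MI" of a candidate scheme is formed, since the text does not spell out the aggregation.

```python
from math import log


def mutual_information(joint):
    """MI (in nats) of two binary variables; joint[i][j] = P(parent=i, child=j)."""
    p_parent = [sum(row) for row in joint]
    p_child = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            p = joint[i][j]
            if p > 0.0:  # 0 * log(0) is taken as 0
                mi += p * log(p / (p_parent[i] * p_child[j]))
    return mi


def total_mutual_information(arc_joints):
    """Assumed aggregation: sum of the MI of every arc in a candidate scheme."""
    return sum(mutual_information(j) for j in arc_joints)


independent = [[0.56, 0.24], [0.14, 0.06]]  # product of marginals 0.8/0.2 and 0.7/0.3
dependent = [[0.40, 0.10], [0.10, 0.40]]
score = total_mutual_information([independent, dependent])
```

Under this assumption, competing schemes would be compared by calling `total_mutual_information` on the arc tables of each scheme and keeping the one with the larger value.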

We first analyzed the case with only one parent to understand the framework, then
the case with two independent parents and then with two dependent parents. Finally
we used the analogies between the cases with one and two parents to set up the
equations for three parents.

One parent case solution 

The case with one parent (figure 1) is the simplest. Let P(F) and P(C) be the marginal
probabilities given by the expert (as in table 1):

• For the parent, F, we have: P(F=Y)=x, P(F=N)=1-x;
• For the child, C, we have: P(C=Y)=y, P(C=N)=1-y.



FIG. 1: One parent scheme 



The equations to find either the conditional probabilities or the joint probabilities are:

CPT equation system

$$\begin{aligned}
a_1 x + a_2 (1 - x) &= y \\
a_1 - a_2 &= k \\
a_1 + a_3 &= 1 \\
a_2 + a_4 &= 1
\end{aligned}$$

Joint equation system

$$\begin{aligned}
c_1 &= \rho M + xy \\
c_2 &= y - \rho M - xy \\
c_3 &= x - \rho M - xy \\
c_4 &= 1 - y - x + \rho M + xy
\end{aligned}$$

where $M = \sqrt{x(1-x)\,y(1-y)}$ and $k = \rho M / \big(x(1-x)\big)$; with $a_j$ and $c_j$ we indicate
respectively the conditional and the joint probabilities.
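Both systems have a closed-form solution, sketched below. The code assumes $M = \sqrt{x(1-x)y(1-y)}$ and $k = \rho M/(x(1-x))$, in line with the definition of $M_{AF}$ used for two parents, and reads $a_1, a_2$ as P(C=Y|F=Y), P(C=Y|F=N) and $c_1, \ldots, c_4$ as the joint probabilities of (F=Y,C=Y), (F=N,C=Y), (F=Y,C=N), (F=N,C=N); this labelling is our interpretation of the equations. The example uses the RISK A / RISK C row of table 1.

```python
from math import sqrt


def one_parent(x, y, rho):
    """Solve the one-parent CPT and joint systems for x=P(F=Y), y=P(C=Y)."""
    m = sqrt(x * (1 - x) * y * (1 - y))
    k = rho * m / (x * (1 - x))
    a2 = y - k * x            # from a1*x + a2*(1-x) = y with a1 = a2 + k
    a1 = a2 + k
    cpt = (a1, a2, 1 - a1, 1 - a2)
    c1 = rho * m + x * y
    joint = (c1, y - c1, x - c1, 1 - y - x + c1)
    return cpt, joint


# RISK A -> RISK C from table 1: x = 0.85, y = 0.55, rho = 0.3.
cpt, joint = one_parent(0.85, 0.55, 0.3)
```

The two systems agree with each other: `joint[0]` equals `cpt[0] * x`, i.e. P(F=Y, C=Y) = P(C=Y | F=Y) P(F=Y).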

Since the probabilities $c_j$ and $a_j$ must be non-negative, either the marginal probabilities
or the correlation value must be constrained. If the marginal probabilities are fixed then
the correlation values must be constrained; this will normally be the case, because the estimates of
the probabilities are more reliable.

Not every value of the correlation is admissible given the marginal probabilities. Indeed,
as we want to keep the marginal probabilities as fixed by the expert, the correlation limits
can be determined as follows:

$$\rho \ge \frac{-xy}{M} = A; \qquad \rho \ge \frac{y + x(1-y) - 1}{M} = D; \qquad \rho \le \frac{x(1-y)}{M} = B; \qquad \rho \le \frac{y(1-x)}{M} = C$$

and the correlation interval will be:

$$\rho \in [\max(A, D);\ \min(B, C)]$$
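The interval is cheap to compute and can be used to screen expert inputs before any CPT is built. The sketch below assumes $M = \sqrt{x(1-x)y(1-y)}$ and obtains the four limits by requiring the four joint probabilities of the one-parent case to be non-negative; it is applied to the two example rows of table 1.

```python
from math import sqrt


def correlation_interval(x, y):
    """Admissible range of rho for fixed marginals x = P(F=Y), y = P(C=Y)."""
    m = sqrt(x * (1 - x) * y * (1 - y))
    lower_a = -x * y / m
    lower_d = (y + x * (1 - y) - 1) / m
    upper_b = x * (1 - y) / m
    upper_c = y * (1 - x) / m
    return max(lower_a, lower_d), min(upper_b, upper_c)


lo_ab, hi_ab = correlation_interval(0.85, 0.35)  # RISK A -> RISK B
lo_ac, hi_ac = correlation_interval(0.85, 0.55)  # RISK A -> RISK C

ab_is_admissible = lo_ab <= 0.5 <= hi_ab
ac_is_admissible = lo_ac <= 0.3 <= hi_ac
```

For these example values the correlation 0.3 of the A-C pair lies inside its interval, while 0.5 for the A-B pair exceeds the upper limit (roughly 0.31) and would have to be reduced if the marginals are kept fixed.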



Two parents case solutions 



This case (figure 2) is more complicated than the previous one. A further
difficulty is that the correlations given by the experts are only pairwise marginal and, therefore, we
need more information to find the CPT, for example the joint moment among the nodes,
which is not in the DB. Consequently
there can be more than one CPT, corresponding to different values of the joint moment, for
the same marginal correlations and probabilities.




FIG. 2: Two independent parents (a) and dependent parents (b) 

The joint moment thus becomes a design parameter to be set by using an appropriate
criterion. We define the standardized joint moment among three variables as:

$$\rho_{N_i N_j N_k} = \frac{E\big[(N_i - E[N_i])(N_j - E[N_j])(N_k - E[N_k])\big]}{\sqrt{Var[N_i]\,Var[N_j]\,Var[N_k]}}$$
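The sketch below evaluates this quantity for a joint distribution of three binary variables. It assumes the normalisation by the square root of the product of the variances, which matches the factor $M_{ABF}$ used in the joint equation system; the XOR example is ours and shows that the joint moment can be non-zero even when all pairwise correlations vanish.

```python
from math import sqrt


def standardized_joint_moment(joint):
    """joint maps (n_i, n_j, n_k) in {0,1}^3 to a probability (missing = 0)."""
    means = [sum(p * s[d] for s, p in joint.items()) for d in range(3)]
    central = sum(
        p * (s[0] - means[0]) * (s[1] - means[1]) * (s[2] - means[2])
        for s, p in joint.items()
    )
    variances = [m * (1 - m) for m in means]  # Bernoulli variance
    return central / sqrt(variances[0] * variances[1] * variances[2])


# Hypothetical example: third variable = XOR of two fair, independent bits.
xor_joint = {(0, 0, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25}
rho_xor = standardized_joint_moment(xor_joint)

# Fully independent variables give a zero joint moment.
indep_joint = {
    (a, b, c): (0.3 if a else 0.7) * (0.4 if b else 0.6) * (0.5 if c else 0.5)
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
}
rho_indep = standardized_joint_moment(indep_joint)
```

This is why the pairwise correlations in the DB do not determine the CPT on their own.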

To choose among such CPTs we have used the total mutual information ($I_{total}$),
selecting the CPT whose $\rho_{N_i N_j N_k}$ gives the minimum $I_{total}$.

In this case we have to distinguish between independent and dependent parents. The 
solutions are: 

CPT equation system for independent parents

$$\begin{aligned}
f(x, z) &= (a_1 - a_2 - a_3 + a_4)\,xz + (a_2 - a_4)\,x + (a_3 - a_4)\,z + a_4 = y \\
\frac{\partial f}{\partial x} &= (a_1 - a_2 - a_3 + a_4)\,z + (a_2 - a_4) = \frac{\rho_{AF}\sqrt{x(1-x)\,y(1-y)}}{x(1-x)} \\
\frac{\partial f}{\partial z} &= (a_1 - a_2 - a_3 + a_4)\,x + (a_3 - a_4) = \frac{\rho_{BF}\sqrt{z(1-z)\,y(1-y)}}{z(1-z)} \\
\frac{\partial^2 f}{\partial x\,\partial z} &= \frac{\partial^2 f}{\partial z\,\partial x} = (a_1 - a_2 - a_3 + a_4) = \frac{\rho_{ABF}\,M_{ABF}}{x(1-x)\,z(1-z)} \\
a_1 + a_5 &= 1, \qquad a_2 + a_6 = 1, \qquad a_3 + a_7 = 1, \qquad a_4 + a_8 = 1
\end{aligned}$$

Joint equation system for dependent parents

$$\begin{aligned}
c_1 &= \rho_{ABF} M_{ABF} + xyz + (\rho_{AF} M_{AF})\,z + (\rho_{BF} M_{BF})\,x + (\rho_{AB} M_{AB})\,y \\
c_1 + c_5 &= \rho_{BF} M_{BF} + zy \\
c_1 + c_3 &= \rho_{AF} M_{AF} + xy \\
c_1 + c_2 &= \rho_{AB} M_{AB} + xz \\
c_1 + c_2 + c_3 + c_4 &= x \\
c_1 + c_5 + c_6 + c_2 &= z \\
c_1 + c_3 + c_5 + c_7 &= y \\
c_8 &= 1 - \sum_{i=1}^{7} c_i
\end{aligned}$$

where $M_{BF} = \sqrt{z(1-z)\,y(1-y)}$, $M_{AF} = \sqrt{x(1-x)\,y(1-y)}$, $M_{AB} = \sqrt{x(1-x)\,z(1-z)}$
and $M_{ABF} = \sqrt{x(1-x)\,z(1-z)\,y(1-y)}$. As before, $a_j$ and $c_j$ are respectively the
conditional and the joint probabilities.

The problem is now how to set the marginal correlations when those given by the experts are
not consistent with the marginal probabilities. Unlike the case with one parent,
where the correlation belongs to an interval, with two parents the admissible pairs $(\rho_{AF}, \rho_{BF})$
can be shown to belong to an area.
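For independent parents the CPT system is linear in $a_1, \ldots, a_4$ and can be solved by back-substitution, as sketched below. We read $a_1, a_2, a_3, a_4$ as P(F=Y | A,B) for (A,B) = (Y,Y), (Y,N), (N,Y), (N,N), with x = P(A=Y), z = P(B=Y), y = P(F=Y); this labelling is our interpretation of the polynomial $f(x,z)$. All numerical inputs in the example are hypothetical.

```python
from math import sqrt


def cpt_two_independent(x, z, y, rho_af, rho_bf, rho_abf):
    """Return (a1, a2, a3, a4) solving the CPT system for independent parents."""
    dx = rho_af * sqrt(x * (1 - x) * y * (1 - y)) / (x * (1 - x))  # df/dx
    dz = rho_bf * sqrt(z * (1 - z) * y * (1 - y)) / (z * (1 - z))  # df/dz
    m_abf = sqrt(x * (1 - x) * z * (1 - z) * y * (1 - y))
    k = rho_abf * m_abf / (x * (1 - x) * z * (1 - z))              # d2f/dxdz
    a4 = y - dx * x - dz * z + k * x * z
    a2 = a4 + dx - k * z
    a3 = a4 + dz - k * x
    a1 = k + a2 + a3 - a4
    return a1, a2, a3, a4


# Hypothetical inputs, chosen only to exercise the formulas.
x, z, y = 0.3, 0.4, 0.5
a1, a2, a3, a4 = cpt_two_independent(x, z, y, 0.2, 0.1, 0.0)

# Marginal of the child recovered from the CPT and the independent parents.
y_back = a1 * x * z + a2 * x * (1 - z) + a3 * (1 - x) * z + a4 * (1 - x) * (1 - z)
```

A returned value outside [0, 1] signals that the chosen correlations and joint moment are not admissible for the given marginals.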

To approach this problem we have decided to decrease the values of the two correlations
$\rho_{BF}$ and $\rho_{AF}$ with a fixed step while maintaining their relative difference. At each step we
verify the existence of a value of $\rho_{ABF}$ which supports the new pair $(\rho_{AF}, \rho_{BF})$. If it exists
the process stops; otherwise it goes on to the next step, and so on.
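One way to read this procedure is sketched below for independent parents. It rests on several assumptions of ours: "maintaining their relative difference" is interpreted as scaling both correlations by a common factor that decreases in fixed steps, the joint moment is searched on a grid over a hypothetical range [-r_max, r_max], and a candidate is accepted when all CPT entries lie in [0, 1].

```python
from math import sqrt


def cpt_two_independent(x, z, y, r_af, r_bf, r_abf):
    """(a1, a2, a3, a4) = P(F=Y | A,B) for (Y,Y), (Y,N), (N,Y), (N,N)."""
    dx = r_af * sqrt(x * (1 - x) * y * (1 - y)) / (x * (1 - x))
    dz = r_bf * sqrt(z * (1 - z) * y * (1 - y)) / (z * (1 - z))
    k = r_abf * sqrt(x * (1 - x) * z * (1 - z) * y * (1 - y)) / (x * (1 - x) * z * (1 - z))
    a4 = y - dx * x - dz * z + k * x * z
    a2 = a4 + dx - k * z
    a3 = a4 + dz - k * x
    return k + a2 + a3 - a4, a2, a3, a4


def is_valid(cpt, tol=1e-12):
    return all(-tol <= a <= 1 + tol for a in cpt)


def find_joint_moment(x, z, y, r_af, r_bf, r_max=1.0, grid=201):
    """Grid search for rho_ABF, smallest magnitude first; None if none works."""
    values = (-r_max + 2 * r_max * i / (grid - 1) for i in range(grid))
    for r_abf in sorted(values, key=abs):
        if is_valid(cpt_two_independent(x, z, y, r_af, r_bf, r_abf)):
            return r_abf
    return None


def shrink_until_feasible(x, z, y, r_af, r_bf, n_steps=100):
    """Scale both correlations down in fixed steps until a rho_ABF exists."""
    for step in range(n_steps + 1):
        scale = 1 - step / n_steps
        cand_af, cand_bf = r_af * scale, r_bf * scale
        r_abf = find_joint_moment(x, z, y, cand_af, cand_bf)
        if r_abf is not None:
            return cand_af, cand_bf, r_abf
    raise ValueError("no admissible correlations found")


# Hypothetical inputs: the expert correlations 0.5 are too high for these marginals.
r_af, r_bf, r_abf = shrink_until_feasible(0.85, 0.85, 0.35, 0.5, 0.5)
```

The loop always terminates, because with both correlations at zero and a zero joint moment every CPT entry equals y.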

If the correlation $\rho_{AB}$ is different from zero (dependent parents), we can set it in advance
using the interval obtained for the case of one parent; the value of $\rho_{AB}$ is then used
in the joint equation system. We can then work only on the pair $(\rho_{AF}, \rho_{BF})$,
following the same procedure as for independent parents and selecting $\rho_{N_i N_j N_k}$.
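For dependent parents the joint system can be solved directly once the four correlation-type parameters are fixed, and the CPT follows by conditioning. In the sketch below $c_1, \ldots, c_8$ are read as the joint probabilities of (A,B,F) = (Y,Y,Y), (Y,Y,N), (Y,N,Y), (Y,N,N), (N,Y,Y), (N,Y,N), (N,N,Y), (N,N,N), which is the labelling implied by the marginal sums of the system; the numerical inputs are hypothetical.

```python
from math import sqrt


def joint_two_dependent(x, z, y, rho_ab, rho_af, rho_bf, rho_abf):
    """Return [c1, ..., c8] with x = P(A=Y), z = P(B=Y), y = P(F=Y)."""
    cov_ab = rho_ab * sqrt(x * (1 - x) * z * (1 - z))
    cov_af = rho_af * sqrt(x * (1 - x) * y * (1 - y))
    cov_bf = rho_bf * sqrt(z * (1 - z) * y * (1 - y))
    mu3 = rho_abf * sqrt(x * (1 - x) * z * (1 - z) * y * (1 - y))
    c1 = mu3 + x * y * z + cov_af * z + cov_bf * x + cov_ab * y
    c2 = cov_ab + x * z - c1
    c3 = cov_af + x * y - c1
    c5 = cov_bf + z * y - c1
    c4 = x - c1 - c2 - c3
    c6 = z - c1 - c2 - c5
    c7 = y - c1 - c3 - c5
    c8 = 1 - (c1 + c2 + c3 + c4 + c5 + c6 + c7)
    return [c1, c2, c3, c4, c5, c6, c7, c8]


def cpt_from_joint(c):
    """P(F=Y | A,B) for (Y,Y), (Y,N), (N,Y), (N,N), by conditioning the joint."""
    c1, c2, c3, c4, c5, c6, c7, c8 = c
    return c1 / (c1 + c2), c3 / (c3 + c4), c5 / (c5 + c6), c7 / (c7 + c8)


# Hypothetical inputs.
c = joint_two_dependent(0.3, 0.4, 0.5, 0.2, 0.2, 0.1, 0.0)
cpt = cpt_from_joint(c)
```

A negative entry in `c` means the chosen parameters fall outside the admissible area and have to be reduced.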

Three parents case solutions 

As before, two equation systems are obtained: one for the case with independent
parents (see figure 3 a), by which the CPT is directly calculated, and another for the case in which there are
some correlations between parents (see figure 3 b), in which the joint probabilities
are calculated instead of the conditional ones. To define the equation systems, the analogies
between the cases with one and two parents have been exploited.



The solutions for independent and dependent parents are in table 2. In such equations,
obviously, there are more missing data, namely all the standardized joint moments
among every two parents and the child and among all parents and the child. So what we do
in such a situation is to use the procedure for the case of two parents for each pair of nodes
and set the correlation values such that they are feasible for all pairs. Note that
the admissible correlation levels are now lower than in the previous cases. Moreover, in this case the
standardized joint moment among all variables is set to zero to make the search less complex.

Furthermore, difficulties arise when there are large differences among the parents'
marginal probabilities. Therefore, when there are more than three parents, we have
decided to split them. Parents are split from the others by using the mutual information
criterion.

FIG. 3: Three independent parents (a) and dependent parents (b).
As before, for the case of dependent parents, to select the feasible marginal correlations
and the standardized joint moments we start by looking for the admissible correlation between
the nodes with one parent (A and B), then for the nodes with two parents (C has B and
A as predecessors), and finally we set the joint moment and the marginal correlations for the
node with three parents (F). Obviously, the procedure is now more complex and it is more
difficult to select the parameters.



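TABLE II: Equation systems for three parents scheme

CPT equation system for independent parents

$$\begin{aligned}
f(x, z, w) ={} & (a_1 - a_2 - a_3 + a_4 - a_5 + a_6 + a_7 - a_8)\,xzw \\
& + (a_2 - a_4 - a_6 + a_8)\,xz + (a_3 - a_4 - a_7 + a_8)\,xw \\
& + (a_5 - a_6 - a_7 + a_8)\,wz + (a_6 - a_8)\,z + (a_4 - a_8)\,x \\
& + (a_7 - a_8)\,w + a_8 = y
\end{aligned}$$

$$\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\rho_{AF}\sqrt{x(1-x)\,y(1-y)}}{x(1-x)} \\
\frac{\partial f}{\partial z} &= \frac{\rho_{BF}\sqrt{z(1-z)\,y(1-y)}}{z(1-z)} \\
\frac{\partial f}{\partial w} &= \frac{\rho_{CF}\sqrt{w(1-w)\,y(1-y)}}{w(1-w)} \\
\frac{\partial^2 f}{\partial x\,\partial z} &= \frac{\rho_{ABF}\sqrt{x(1-x)\,z(1-z)\,y(1-y)}}{x(1-x)\,z(1-z)} \\
\frac{\partial^2 f}{\partial w\,\partial z} &= \frac{\rho_{BCF}\sqrt{w(1-w)\,z(1-z)\,y(1-y)}}{w(1-w)\,z(1-z)} \\
\frac{\partial^2 f}{\partial x\,\partial w} &= \frac{\rho_{ACF}\sqrt{w(1-w)\,x(1-x)\,y(1-y)}}{x(1-x)\,w(1-w)} \\
\frac{\partial^3 f}{\partial x\,\partial w\,\partial z} &= \frac{\rho_{ABCF}\sqrt{w(1-w)\,z(1-z)\,y(1-y)\,x(1-x)}}{x(1-x)\,z(1-z)\,w(1-w)}
\end{aligned}$$

$$\begin{aligned}
a_1 + a_9 &= 1, & a_2 + a_{10} &= 1, & a_3 + a_{11} &= 1, & a_4 + a_{12} &= 1, \\
a_5 + a_{13} &= 1, & a_6 + a_{14} &= 1, & a_7 + a_{15} &= 1, & a_8 + a_{16} &= 1
\end{aligned}$$

Joint equation system for dependent parents

$$\begin{aligned}
c_1 ={} & \rho_{ABCF} M_{ABCF} + xyzw + \rho_{ABC} M_{ABC}\,y + \rho_{ABF} M_{ABF}\,w \\
& + \rho_{ACF} M_{ACF}\,z + \rho_{BCF} M_{BCF}\,x + \rho_{AB} M_{AB}\,wy \\
& + \rho_{AC} M_{AC}\,zy + \rho_{BC} M_{BC}\,xy + \rho_{AF} M_{AF}\,zw \\
& + \rho_{BF} M_{BF}\,xw + \rho_{CF} M_{CF}\,xz
\end{aligned}$$

$$\begin{aligned}
c_1 + c_2 + c_3 + c_4 + c_5 + c_6 + c_7 + c_8 &= w \\
c_1 + c_2 + c_3 + c_4 + c_9 + c_{10} + c_{11} + c_{12} &= x \\
c_1 + c_2 + c_5 + c_6 + c_9 + c_{10} + c_{13} + c_{14} &= z \\
c_1 + c_3 + c_5 + c_7 + c_9 + c_{11} + c_{13} + c_{15} &= y
\end{aligned}$$

$$\begin{aligned}
c_1 + c_2 &= \rho_{ABC} M_{ABC} + xzw + \rho_{AB} M_{AB}\,w + \rho_{AC} M_{AC}\,z + \rho_{BC} M_{BC}\,x \\
c_1 + c_9 &= \rho_{ABF} M_{ABF} + xyz + \rho_{AF} M_{AF}\,z + \rho_{BF} M_{BF}\,x + \rho_{AB} M_{AB}\,y \\
c_1 + c_3 &= \rho_{ACF} M_{ACF} + xyw + \rho_{AF} M_{AF}\,w + \rho_{CF} M_{CF}\,x + \rho_{AC} M_{AC}\,y \\
c_1 + c_5 &= \rho_{BCF} M_{BCF} + yzw + \rho_{BF} M_{BF}\,w + \rho_{CF} M_{CF}\,z + \rho_{BC} M_{BC}\,y
\end{aligned}$$

$$\begin{aligned}
c_1 + c_2 + c_9 + c_{10} &= \rho_{AB} M_{AB} + xz \\
c_1 + c_2 + c_3 + c_4 &= \rho_{AC} M_{AC} + xw \\
c_1 + c_3 + c_9 + c_{11} &= \rho_{AF} M_{AF} + xy \\
c_1 + c_2 + c_5 + c_6 &= \rho_{BC} M_{BC} + zw \\
c_1 + c_3 + c_5 + c_7 &= \rho_{CF} M_{CF} + wy \\
c_1 + c_5 + c_9 + c_{13} &= \rho_{BF} M_{BF} + zy
\end{aligned}$$

$$c_{16} = 1 - \sum_{i=1}^{15} c_i$$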
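The first equation of the joint system is an expansion of the raw moment P(A=Y, B=Y, C=Y, F=Y) in terms of central moments, since each product $\rho_S M_S$ is the central joint moment of the variables in $S$. The sketch below checks that identity numerically on a randomly generated joint distribution of four binary variables; the random distribution and the helper names are ours and serve only as a consistency check.

```python
import random
from itertools import combinations, product
from math import prod

random.seed(0)
states = list(product((0, 1), repeat=4))  # ordered as (A, B, C, F)
weights = [random.random() for _ in states]
norm = sum(weights)
joint = {s: w / norm for s, w in zip(states, weights)}

# Marginal probabilities x, z, w, y of A, B, C, F.
means = [sum(p * s[i] for s, p in joint.items()) for i in range(4)]


def central_moment(idx):
    """E[prod_{i in idx} (N_i - E[N_i])]; plays the role of rho_S * M_S."""
    return sum(p * prod(s[i] - means[i] for i in idx) for s, p in joint.items())


def c1_from_moments():
    """Sum over all subsets S of: central moment of S times means of the rest."""
    total = 0.0
    for size in range(5):
        for idx in combinations(range(4), size):
            rest = [i for i in range(4) if i not in idx]
            total += central_moment(idx) * prod(means[i] for i in rest)
    return total


c1_direct = joint[(1, 1, 1, 1)]
c1_expanded = c1_from_moments()
```

The empty subset contributes the product of the four marginals, the single-variable subsets contribute zero, and the remaining subsets give the pairwise, three-way and four-way terms of the first equation.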



CONCLUSION 

So far we have seen that the CPTs can be obtained by using the equation systems for the
conditional and joint probabilities. The method can be generalized to the case with more than three
parents, but there are problems in setting more parameters (standardized joint moments)
and in looking for more complicated feasible marginal correlation areas.
So, to develop a network, we propose to use separately, first, the equations and procedure
for one parent; second, those for two parents, distinguishing between dependent
and independent parents. Finally, when possible, we use the equations and procedures for the three
parents case, again distinguishing between dependent and independent
parents; otherwise we split one parent from the others by using the mutual information as
splitting index.
We remark that configurations with more than three parents need to be reduced to a simpler
case. We can achieve this by estimating a local approximate structure,
with only one, two or three parents, by "separating" those parents that have different effects on
the child (for instance, different incremental ratios). If more than one scheme is available
for the substitution, we select the one with the highest MI ($I_{total}$).

It is important to note that this method is modular: if we add or delete a
node, we can use the appropriate system (one, two or three nodes) according to whether we add or
delete a parent or a child.